Bifidobacterium longum subsp. longum CMCC P0001, a standard probiotic strain in China, has been widely used in clinical
medicine for more than 20 years. Here we report the genome features of B. longum strain CMCC P0001. 
Bifidobacterium longum subsp. longum strain CMCC P0001,
isolated from feces of healthy children, has been commercially 
used in the probiotic compound BIFICO (Shanghai Sine Pharma- 
ceutical Co., Ltd., Shanghai, China) for more than 20 years (1, 2). 
In 2011, the strain was designated a standard strain for probiotic
production by the China Medical Culture Collection Center 
(CMCC) with the assigned number CMCC P0001. 

Probiotics and their metabolites have been demonstrated to be 
crucial to human health. Among all probiotics, those of the genus 
Bifidobacterium are indeed remarkable, because they play impor- 
tant roles in preventing infection, enhancing immunity, inhibit- 
ing the growth of pathogenic bacteria, and treating inflammatory 
diseases (3). Here, the draft genome sequence of the standard 
probiotic strain B. longum CMCC P0001 is presented. 

The genome sequence of CMCC P0001 was determined using the
Illumina HiSeq 2000 system with paired-end and shotgun
libraries (176× coverage). A total of 5,722,028
reads with an average length of 74 bp were assembled into 136
contigs using SOAPdenovo v1.04 (4, 5). Protein-coding genes
were predicted by GLIMMER 3 (6). Protein functions were anno- 
tated by use of a sequence similarity search using BLAST programs 
(7) against the proteins of the other B. longum strains and the 
nonredundant protein database of the NCBI. tRNAs and rRNAs 
were identified by tRNAscan-SE (8) and BLAST (9), respectively.
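
As a rough consistency check on the sequencing figures, the depth of
coverage can be estimated as the total number of sequenced bases divided
by the genome size. The sketch below is illustrative only: the function
name is hypothetical, the genome size is the 2,418,214-bp figure reported
for this draft assembly, and the 74-bp read length is a rounded average,
so the result should only approximate the reported coverage.

```python
def estimated_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Approximate sequencing depth: total bases sequenced / genome size."""
    return n_reads * mean_read_len / genome_size


# Values taken from the text; 74 bp is a rounded average read length.
coverage = estimated_coverage(
    n_reads=5_722_028,
    mean_read_len=74,
    genome_size=2_418_214,
)
print(f"{coverage:.1f}x")
```

With these inputs the estimate comes out at about 175x, close to the
reported 176× coverage; the small gap is presumably due to rounding of
the average read length.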

The draft genome sequence of B. longum CMCC P0001 con- 
sists of a 2,418,214-bp circular molecule without any plasmids. 
The overall G+C content of the genome is 59.75%. CMCC P0001 
harbors 54 tRNA genes. A total of 1,569 coding sequences (CDS) 
were predicted in the genome; 1,391 (88.7%) CDS were assigned
predicted functions, whereas protein functions for 178 (11.3%)
of the CDS remained unclear.
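
The CDS proportions above follow directly from the reported counts. A
minimal sketch of the arithmetic is given below; the variable names are
illustrative and not part of the annotation pipeline used for this
genome.

```python
# CDS counts as reported in the text.
total_cds = 1569
functional = 1391  # CDS with a predicted function
unclear = 178      # CDS whose function is unclear

# The two classes should account for every predicted CDS.
counts_consistent = (functional + unclear == total_cds)

pct_functional = round(100 * functional / total_cds, 1)
pct_unclear = round(100 * unclear / total_cds, 1)

print(counts_consistent, pct_functional, pct_unclear)
```

The two classes sum to the 1,569 predicted CDS, and the computed shares
match the 88.7% and 11.3% given in the text.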

A gene encoding a serpin (serine protease inhibitor) with al- 
most 100% similarity to that of NCC2705 (10) was found in the 
genome sequence. As a potential probiotic effector molecule, 
the serpin may contribute to the immunomodulation of this 
B. longum strain. Furthermore, a large number of predicted
proteins (>8% of the total predicted proteins) encoded in the
genome fell into the carbohydrate transport and metabolism category.
This enhanced capacity for carbohydrate transport and metabolism
likely contributes to the competitiveness and persistence of
bifidobacteria in the colon (11).

The availability of the whole-genome sequence of CMCC 
P0001 will facilitate further analysis and understanding of the 
health-promoting characteristics of the probiotic strain B. longum 
CMCC P0001. 

Nucleotide sequence accession number. The draft genome
sequence of B. longum CMCC P0001 has been deposited at GenBank
under the accession number APVE00000000.